Abstract

The increased use of botnets as an attack tool and the
awareness attackers have of blocking lists lead to the
question of whether we can effectively predict future bot
locations. To that end, we introduce a network quality
that we term uncleanliness: an indicator of the propensity
for hosts in a network to be compromised by outside
parties.

We hypothesize that unclean networks will demonstrate
two properties: spatial and temporal uncleanliness.
Spatial uncleanliness is the tendency for compromised
hosts to cluster more densely within unclean
networks. Temporal uncleanliness is the tendency for
unclean networks to contain compromised hosts for extended
periods.

We test for these properties by collating data from
multiple indicators (spamming, phishing, scanning and
botnet IRC log monitoring). We demonstrate evidence
for both spatial and temporal uncleanliness. We further
show evidence for cross-relationships between the various
datasets, showing that botnet activity predicts spamming
and scanning, while phishing activity appears to be
unrelated to the other indicators.

1  Introduction 

Botnets are a common attack tool due to the anonymity
and flexibility that they provide attackers. Modern bots
can be used for DDoS, spamming, infiltration of local
networks, key-logging and other criminal acts [5, 15].
Past research, notably by Mirkovic et al. [18], has shown
that botnet-based attacks can be divided into distinct
phases of acquisition and use.

We expect that bot acquisition is effectively opportunistic
[2]: while attackers may avoid certain networks
[24], in the majority of cases, attackers have no
interest or knowledge about targets except that the target
is vulnerable. With automatically propagating attack
tools, an attacker may not know about the existence of a
target until after he compromises it.

As bot software has become more sophisticated and
flexible, it is now reasonable to expect that any publicly
accessible host on the Internet will be attacked by every
common method within a short period (for example,
specific variants of Gaobot can spread themselves using
network shares, AOL Instant Messenger, and multiple
Windows vulnerabilities¹).

We therefore expect that within a short time, a host
will be attacked by every possible means of compromise².
If we assume that an attacker cannot distinguish
between the hosts within a network, then he has an equal
chance of attacking any of them. In addition, with no
advance knowledge of what the target is vulnerable to,
an attacker will use all attacks available to him. Consequently,
the probability that a machine will be compromised
during some period is not a function of that host’s
attacker, but of its defenders.

We hypothesize that networks have a property, which
we term uncleanliness, that indicates the propensity
for hosts within a network to be compromised.
Our intuition is as follows: consider two institutions
with different defensive postures. Institution
A maintains an aggressive firewall policy, disables all
email attachments, maintains all files on a central server
and restores all hosts on the network from a ghosted
state each night. Institution B has no central inventory
of machines, runs a variety of hardware and software
installations that administrators might not even know
about, has a large number of self-administered machines
and no firewall. We would expect that institution A
would be less vulnerable to attacks, and that if a machine
was compromised, it would be restored to its original
state quickly. Conversely, machines in institution B
will be reached by a larger number of attacks, and when
a machine is compromised, it may not be noticed or repaired
until long after the compromise has taken place.

¹ http://www.symantec.com/enterprise/security-response/writeup.jsp?docid=

² A report of the expected time between attacks for specific
vulnerabilities is available at http://isc.sans.org/survivaltime.html;
the interval between attacks for the average
address is on the order of 20 minutes.

We can estimate the uncleanliness of a network by
examining its result: once an attacker has compromised
hosts, he will use them for criminal activities. If a host
is compromised, we expect that the attacker will use
it to, for example, spam, scan and DDoS networks. If
uncleanliness is a network-specific property, we expect
that compromised hosts will congregate in specific networks,
which we quantify via the phenomena of spatial
and temporal uncleanliness. Note that uncleanliness is
a network-level property: hosts are compromised, networks
are unclean.

We define spatial uncleanliness as a tendency for
compromised hosts to cluster in unclean networks. Spatial
uncleanliness implies that if we see a host engaged
in hostile activity (such as scanning), we have a good
chance of finding another IP address in the same network
engaged in hostile activity. We will test for spatial
uncleanliness by examining the clustering of addresses
within networks.

We define temporal uncleanliness as a tendency for
compromised hosts to repeatedly appear in unclean networks.
Temporal uncleanliness implies that if a host is
compromised, then other hosts within that network will
be compromised in the future. We will test for temporal
uncleanliness by examining the ability of unclean networks
to predict future host compromises.

Figure 1 confirms our intuition for spatial uncleanliness
and temporal uncleanliness. This figure shows two
plots: the upper counts the number of unique hosts scanning
a large network from January to April, 2006. The
lower plot shows how many of these scanning
addresses were also present in a botnet reported during
the first week of March, 2006. This plot contains two
lines: one counts the number of unique addresses from
the bot report which were also identified scanning; the
second counts the number of unique addresses from the
bot report which were present in a 24-bit CIDR block
where at least one address was also scanning.


First note that these reports resulted from two different
detection methods: the bots were collected by observing
IP addresses communicating on IRC channels,
while scanning data was collected using a behavioral
scan detection method deployed on an observed network
[6]. Despite this, there is a strong intersection between
the two sets: at its peak, 35% of the botnet’s addresses
are scanning the observed network.

Second, we observe that using the /24’s comprising
the botnet identifies more scanners than the botnet addresses
alone; this value ranges between a 25% increase
and 4 times as many addresses depending on the activity.
We demonstrate in §4 that these results are statistically
significant.
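The two lines in the lower plot of Figure 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' tooling: the function names `block24` and `overlap_counts` and the sample addresses are hypothetical, and the code follows the description above (bot-report addresses that scan, and bot-report addresses sharing a /24 with at least one scanner).

```python
import ipaddress


def block24(addr):
    """Return the /24 CIDR block containing an IPv4 address string."""
    return ipaddress.ip_network(f"{addr}/24", strict=False)


def overlap_counts(bot_report, scan_report):
    """Return (exact, by_block) for one observation period.

    exact    -- bot addresses that were themselves seen scanning
    by_block -- bot addresses in a /24 where at least one address scanned
    """
    bots, scanners = set(bot_report), set(scan_report)
    exact = len(bots & scanners)
    scanning_blocks = {block24(a) for a in scanners}
    by_block = sum(1 for a in bots if block24(a) in scanning_blocks)
    return exact, by_block


# Synthetic addresses, invented for illustration only.
bots = ["192.0.2.5", "192.0.2.9", "198.51.100.1"]
scanners = ["192.0.2.5", "203.0.113.4"]
print(overlap_counts(bots, scanners))
```

The block-level count can never be smaller than the exact count, since every bot address that scans also lies in a /24 containing a scanner.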

Finally, Figure 1 also explains our intuition for temporal
uncleanliness. As this figure shows, abnormal scanning
(and therefore botnet compromise) occurs over several
weeks. If bots take several weeks to be identified
and removed, we expect that an unclean network will be
unclean for some duration, and therefore we can predict
future hostile activity from the same network.

In this paper, we examine four potential indicators
of uncleanliness: botnet data, scanning activity, spamming
and phishing. We collect reports of unclean activity
from multiple sources: public mailing lists and web
sites, private studies, and traffic crossing
a large (multiple /8) network.

The primary contribution of this paper is an empirical
study of uncleanliness and its use as a predictive aid. We
test for the existence of spatial and temporal uncleanliness
by comparing the traffic from various reports of
hostile activity. We demonstrate that compromised hosts
are both more densely clustered than normal traffic and
predict future unclean activity. In addition, we show that
scanning, spamming and botnet activity show evidence
of cross-relationship, such as the scanning observed in
Figure 1. We also show that while these phenomena
do not predict future phishing sites, past phishing sites
do, therefore demonstrating that temporal uncleanliness
holds for all four indicators. We then test the strength of
this predictive mechanism by evaluating its suitability to
block traffic crossing a large network. We demonstrate
that limited predictive blocking is feasible, due to the
impact of locality [17] evident in network traffic.

The remainder of this paper is structured as follows:
§2 outlines relevant previous work in reputation management
and identifying hostile groups by past history.
§3 describes our model and the data sources we use in
this paper. §4 examines the spatial uncleanliness hypothesis,
and §5 examines the temporal uncleanliness
hypothesis. §6 examines the impact of blocking unclean
networks, and §7 discusses the results.

Figure 1: Relationship between scanning and botnet population

2  Previous  Work 

Researchers initially studied botnets due to their use in
DDoS attacks; in this domain, Mirkovic et al. [18] defined
a DDoS attack as a two-phase process: acquiring
hosts to use for the DDoS and then using those hosts to
conduct an attack. Freiling et al. [5] identify a variety of
other attacks that botnets can conduct efficiently. Collins
et al. [2] define bot occupation attacks as conducted by
opportunistic attackers: that is, the attacker has no interest
or knowledge of the target except that the target is
exploitable. Our work uses these concepts to study
largely automated acquisition and its impact
on network defense.

Botnet demographics have been studied using honeypots
and by actively probing bot networks [8, 9, 21].
Rajab et al.'s [21] analysis is particularly relevant due
to the extended period during which they observed network
traffic, allowing them to identify not only botnet
demographics but also activity. Our work differs from these
analyses by comparing multiple observed phenomena
and using this information to predict future activity.

In operational security, blacklists are commonly used
to identify and block hosts that are already assumed
to be hostile. Examples of such blacklists include
Spamhaus’ ZEN list [20] and the Bleeding Snort rule
set [23]. Researchers such as Levy [16] note that spammers
increasingly rely on the use of occupied hosts to
generate spam messages; these approaches are more
attractive to spammers because they offload processing
requirements from the spammer (as noted by Laurie et
al. [15]) and because they hide the attacker’s identity [4].

In addition, researchers have studied the impact of
blacklists on spamming and other hostile activity. Jung et
al. [12] compare spamming blacklists against spam traffic
to MIT in 2000 and 2004, finding that in 2004, 80%
of spammers were identified by blacklists. Ramachandran
et al. [22] examine blacklist abuse by botnet owners.
Ramachandran notes that botnet owners appear to
place a higher premium on addresses not present on
block lists. Since uncleanliness is intended to predict
future hostile addresses, this may impact the costs noted
by Ramachandran.

McHugh et al. use locality to characterize normal
network behavior and differentiate attacks. Krishnamurthy
et al. [14] use netblocks to characterize target
audiences for networks, and demonstrate that many sites
have common audiences. This leads to a generalized
netblock-level approach developed by Jung et al. [10]
for DDoS defense. These methods of blocking are predicated
on the assumption that attack traffic differs from
normal traffic due to a limited and clustered audience for
any normal service. Our filtering approach differs from
the past history used in these cases by developing a set
of explicitly untrusted networks.

3  Source  Data 

We demonstrate evidence of uncleanliness by showing
that address distributions from unclean data sets show
specific qualities; in order to do so, we must collate information
from various sources with different collection
methods. In this section we describe a simple taxonomy
and notation scheme for managing our data; in the
following sections we use this data to demonstrate significance.
This section is divided as follows: §3.1 explains
the taxonomy and notation for reports, and §3.2
describes the individual reports.

3.1  Model 

In  order  to  estimate  the  uncleanliness  of  a  network,  we 
must  compare  data  from  multiple  sources.  For  example, 
an  attacker  may  initially  use  a  bot  for  scanning,  then  for 
spamming.  We  call  these  sources  reports,  each  of  which 
consists  of  a  set  of  IP  addresses  describing  a  particular 
phenomenon  over  some  period.  Reports  differ  by  the 
class  of  data  reported,  the  period  covered  by  the  report, 
and  the  method  used  to  generate  that  data. 

We  use  four  classes  of  unclean  data  for  this  paper: 

1.  Bots: An IP address identified as hosting some
form of bot software or communicating with a botnet
command and control host.

2.  Phishing: An IP address identified as hosting a
phishing site in order to fraudulently acquire private
user information.

3.  Scanning: An IP address identified as scanning using
the methods developed by Gates et al. [7] and
Jung et al. [11].

4.  Spamming: An IP address identified as spamming
using a behavioral spam detection technique³.

³ This spam detection method is currently under review.


These  reports  all  describe  phenomena  associated  with 
compromised  hosts.  Scanning  and  spamming  are  both 
common  botnet  uses,  and  phishing  requires  setting  up  a 
fraudulent  web  site. 

We further divide reports as either provided or observed.
Provided reports are collected from external parties,
and can use different methodologies to observe the
same effects. For example, a phishing list can acquire
IP addresses by using spam traps [19] or by collecting
user reports (e.g., the submission form at the CastleCops
PIRT service [1]). For the analyses within this paper,
we use only one source per report and assume that
the source’s collection methodology is consistent over
the report period. In contrast to provided reports, observed
reports are generated from network traffic logs
reporting traffic covering a large edge network.

We use a simple notation to describe all reports; each
report is differentiated by a tag which, for this paper,
summarizes the period and source for the report. We
express this using the notation $\mathcal{R}_T$. In this form, $T$ is
the tag (e.g., scan). A list of reports used in this paper is
given in Table 1.

Because we expect uncleanliness to be a network
property, we define a CIDR masking function $C_n(i)$.
The CIDR masking function evaluates to the unique
CIDR block with prefix length $n$ that contains the IP address
$i$ (e.g., $C_{16}(127.1.135.14) = 127.1.0.0/16$). For
convenience, when the CIDR masking function is applied
to a report $S$, the result is set-valued and returns
the set of all $n$-bit CIDR blocks in that set, that is:

$$C_n(S) = \bigcup_{i \in S} C_n(i) \qquad (1)$$

When determining whether or not an IP address resides
within a set of CIDR blocks, we will use a CIDR
inclusion relation, $\sqsubset$, to indicate that an IP address is
resident in one of a set of CIDR blocks:

$$i \sqsubset S \rightarrow \exists n \ \text{s.t.}\ C_n(i) \in S \qquad (2)$$

With all sets and reports, we use bars to indicate cardinality,
i.e., $|S|$ is the number of elements in the set $S$.
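The masking function, its set-valued form, and the inclusion relation can be sketched with Python's standard `ipaddress` module. This is an illustrative sketch only; the function names `cidr_block`, `cidr_blocks` and `included` are our own labels for $C_n(i)$, $C_n(S)$ and the inclusion relation, and do not come from the original study.

```python
import ipaddress


def cidr_block(addr, n):
    """C_n(i): the unique n-bit CIDR block containing IPv4 address `addr`."""
    # strict=False masks off the host bits instead of raising an error.
    return ipaddress.ip_network(f"{addr}/{n}", strict=False)


def cidr_blocks(report, n):
    """C_n(S): the set of all n-bit CIDR blocks occupied by a report."""
    return {cidr_block(addr, n) for addr in report}


def included(addr, blocks):
    """Inclusion relation: True if some C_n(addr) is a member of `blocks`."""
    return any(cidr_block(addr, n) in blocks for n in range(0, 33))


print(cidr_block("127.1.135.14", 16))
print(len(cidr_blocks(["10.0.0.1", "10.0.0.2", "10.0.1.1"], 24)))
```

Cardinality $|S|$ corresponds to `len()` of the resulting set, which is how the later density comparisons count blocks.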

3.2  Reports 

Table  1  is  an  inventory  of  all  the  reports  used  in  this 
paper.  Recall  that  provided  reports  have  been  given  to 
us  by  other  parties  and  that  we  generate  observed  reports 
using  traffic  logs  from  the  observed  network.  Because 
we  have  greater  control  over  observed  reports,  we  can 
generate  these  reports  over  arbitrary  periods.  We  have 
less  control  over  when  we  receive  provided  reports. 
The observed network is composed of over 20 million
distinct IPv4 addresses and contains several servers that
are heavily used by clients across the Internet. Given the
size and activity of the observed network, we assume
that IP addresses from the Internet crossing into it are a
representative sample of the Internet as a whole.

In order to compensate for selection bias within observed
reports, all reports have been filtered to only include
addresses which are outside of the observed network
and are not otherwise reserved (e.g., all addresses
specified in RFC 1918 have been removed from reports).
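A minimal sketch of this filtering step follows. It assumes the observed network can be written as a list of CIDR prefixes; the prefix used here (`198.51.100.0/24`) is a placeholder, since the real observed network is not identified, and only the RFC 1918 ranges are treated as reserved in this illustration.

```python
import ipaddress

# Private address ranges defined by RFC 1918.
RFC1918 = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def filter_report(report, observed_prefixes):
    """Keep addresses outside the observed network and outside RFC 1918."""
    excluded = RFC1918 + [ipaddress.ip_network(p) for p in observed_prefixes]
    kept = set()
    for addr in report:
        ip = ipaddress.ip_address(addr)
        if not any(ip in net for net in excluded):
            kept.add(addr)
    return kept


# Hypothetical observed network and synthetic report, for illustration only.
raw = ["10.1.2.3", "8.8.8.8", "198.51.100.7", "172.20.0.1", "192.168.1.1"]
print(filter_report(raw, ["198.51.100.0/24"]))
```

A fuller filter would also drop other reserved ranges (loopback, multicast and so on); the text only names RFC 1918 as an example, so the sketch stops there.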

We classify four of the reports in this list as unclean
reports; these are the reports we use as ground truth
for identifying the four indicators discussed in §3.1:
botnet membership, phishing sites, scanners and spammers.
During the two-week period of October 1st-14th,
2006, we have both provided and observed reports on all
classes of unclean activity; consequently, we use October
1st-14th to test temporal uncleanliness.

The next set of reports is used specifically to
test the spatial and temporal uncleanliness hypotheses.
The $\mathcal{R}_{\text{bot-test}}$ report describes a small botnet from 5
months before all the other activity analyzed in this paper.
We use $\mathcal{R}_{\text{bot-test}}$ as an extreme case for prediction:
if a five-month-old report can accurately predict current
unclean activity, then a more recent one should be more
effective.

The control report consists of 47 million unique IP
addresses observed during the week of September 25th,
2006. We compare the data from our other reports
against randomly generated subsets of control in order
to determine whether or not these reports exhibit spatial
or temporal uncleanliness. We use the control report to
more accurately reflect the structure of IPv4 space than
we would using purely randomly chosen IP addresses.
The report consists of IP addresses observed to engage
in payload-bearing TCP activity, which reduces the risk
of the address being spoofed. Furthermore, as noted in
§3.1, the observed network includes a variety of servers
used by hosts throughout the Internet, and by focusing
exclusively on the IP addresses of the hosts without
using any criteria apart from the unspoofed criterion,
we expect the resulting report to approximate a random
sample of active IP addresses on the Internet.

4  Spatial  Uncleanliness 

We define spatial uncleanliness as the propensity for occupied
addresses (bots) to be clustered in unclean networks.
In this section, we formulate and test the spatial
uncleanliness hypothesis.


This  section  is  divided  as  follows:  §4.1  describes  the 
methodology  used  to  test  for  spatial  uncleanliness.  §4.2 
describes  the  results  of  our  tests  and  shows  evidence  for 
spatial  uncleanliness. 

4.1  Model  and  methodology 

Recall our assumption that the likelihood of a host being
compromised is a network property: if a network is unclean,
then its administrators will not identify compromised
machines or rapidly repair them. Consequently,
we expect that multiple hosts within an unclean network
will be compromised, and that compromised addresses
will cluster within unclean networks. In order to test this
hypothesis, we will compare the expected population of
compromised hosts within equally sized CIDR blocks.

To test for spatial uncleanliness, we begin with a measurement
for comparative density. If we have two sets,
$S_1$ and $S_2$, and $|S_1| = |S_2|$, then we say that $S_1$ is
denser at $n$ bits if the number of $n$-bit CIDR blocks
that $S_1$ occupies is less than the number of $n$-bit CIDR
blocks occupied by $S_2$.

Throughout this paper, we use homogeneously sized
CIDR blocks to model individual networks. While other
categorization techniques are available, we opt to use homogeneously
sized CIDR blocks in order to control for
population. Heterogeneous partitions, such as Krishnamurthy
et al.'s network-aware clustering method [14],
result in networks that differ in size by several orders of
magnitude.

In §1, we stated that spatial uncleanliness implies that
if a host is compromised, there is a good chance another
host on the same network will be compromised.
Consequently, if we had a set of compromised host addresses,
and a control set of randomly selected addresses
with equal cardinality, we would expect that the compromised
address set was at least as dense at all CIDR
prefix lengths.

We therefore summarize the spatial uncleanliness hypothesis
as follows: if we have a report which selects
unclean traffic from the Internet, $\mathcal{R}_{\text{unclean}}$, then the IP
addresses within that report will be more densely packed
than a randomly selected set of IP addresses with equal
cardinality.

To test the spatial uncleanliness hypothesis, we use
the formulation given in Equation 3 below. Assuming
that we have two reports, $\mathcal{R}_{\text{unclean}}$ which reports on unclean
traffic, and $\mathcal{R}_{\text{control}}$ which is control data, and both
reports are of equal cardinality, then:

$$\forall n \in [16, 32]: \ |C_n(\mathcal{R}_{\text{unclean}})| \le |C_n(\mathcal{R}_{\text{control}})| \qquad (3)$$
Unclean reports

| Tag | Type | Class | Valid Dates | Size | Reporting method |
|---|---|---|---|---|---|
| bot | Provided | Bots | 2006/10/01-2006/10/14 | 621,861 | Bot addresses acquired through private reports from a third party |
| phish | Provided | Phishing | 2006/05/01-2006/11/01 | 53,789 | Addresses from a phishing report list |
| scan | Observed | Scanning | 2006/10/01-2006/10/14 | 151,908 | IP addresses scanning the observed network |
| spam | Observed | Spam | 2006/10/01-2006/10/14 | 397,306 | IP addresses spamming the observed network |

Reports for hypothesis testing

| Tag | Type | Class | Valid Dates | Size | Reporting method |
|---|---|---|---|---|---|
| bot-test | Provided | Bots | 2006/05/10 | 186 | Botnet addresses acquired through private communication |
| control | Observed | N/A | 2006/09/25-2006/10/02 | 46,899,928 | Control addresses acquired from the observed network |

Table 1: Table of tags used in this report


Based on DDoS filtering work done by Collins and
Reiter [3], we expect that prefix lengths shorter than 16 bits
will be too imprecise for effective filtering and detection.
Consequently, we limit our prefix lengths to between 16 and
32 bits.
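The comparative-density test of Equation 3 can be sketched as below. This is an illustration under stated assumptions rather than the authors' implementation: the helper names (`block_count`, `addresses_per_block`, `at_least_as_dense`) are hypothetical, and the two four-address reports are synthetic.

```python
import ipaddress


def block_count(report, n):
    """|C_n(S)|: number of distinct n-bit CIDR blocks occupied by a report."""
    return len({ipaddress.ip_network(f"{a}/{n}", strict=False) for a in report})


def addresses_per_block(report, n):
    """Mean number of report addresses per occupied n-bit block."""
    return len(set(report)) / block_count(report, n)


def at_least_as_dense(unclean, control, lengths=range(16, 33)):
    """Equation 3: the unclean report occupies no more n-bit blocks than an
    equally sized control report at every prefix length in `lengths`."""
    if len(set(unclean)) != len(set(control)):
        raise ValueError("reports must have equal cardinality")
    return all(block_count(unclean, n) <= block_count(control, n) for n in lengths)


# Synthetic addresses, invented for illustration only.
unclean = ["192.0.2.1", "192.0.2.2", "192.0.2.3", "198.51.100.1"]
control = ["192.0.2.1", "198.51.100.1", "203.0.113.1", "203.0.113.129"]
print(at_least_as_dense(unclean, control))
print(addresses_per_block(unclean, 24))
```

Because both reports have the same cardinality, occupying fewer blocks is equivalent to having more addresses per block, which is why the two views of density agree.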

4.2  Analysis 

In order to test the spatial uncleanliness hypothesis, as
formulated in Equation 3, we compare the population of
addresses per $n$-bit CIDR block for an unclean report
against the expected population for $n$-bit CIDR blocks
across the Internet as a whole.

As discussed in §3.2, we model network populations
by randomly selecting IP addresses from the $\mathcal{R}_{\text{control}}$ report.
Kohler et al. [13] observe that IP addresses are
not evenly distributed across IPv4 space; as a consequence,
a purely random model will result in an artificially
depressed density estimate. To compensate for
this, we test two population estimates. The first, naive,
estimate selects addresses evenly from across all /8’s
which are listed as populated by IANA⁴. The second,
empirical, density estimate draws addresses from a pool
of addresses observed to cross the network under observation
from the week of September 25th-October 1st,
2006. In the empirical estimate, we create 1000 randomly
generated subsets of $\mathcal{R}_{\text{control}}$ and group the resulting
addresses.
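The empirical estimate can be sketched as repeated sampling from the control pool. This is a hedged illustration: the function `control_block_counts`, the fixed seed, and the tiny synthetic pool are assumptions made for the example, whereas the study draws from roughly 47 million observed addresses.

```python
import ipaddress
import random
import statistics


def block_count(addresses, n):
    """Number of distinct n-bit CIDR blocks occupied by the addresses."""
    return len({ipaddress.ip_network(f"{a}/{n}", strict=False) for a in addresses})


def control_block_counts(control_pool, size, n, trials=1000, seed=0):
    """Draw `trials` random subsets of `size` addresses from the control pool
    and return the n-bit block count of each subset."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    pool = sorted(set(control_pool))
    return [block_count(rng.sample(pool, size), n) for _ in range(trials)]


# Tiny synthetic pool (4 /16s with 8 addresses each), for illustration only.
pool = [f"10.{i}.{j}.1" for i in range(4) for j in range(8)]
counts = control_block_counts(pool, size=5, n=16, trials=50)
print(min(counts), statistics.median(counts), max(counts))
```

The resulting list of counts is the distribution that a boxplot would summarize for one prefix length; repeating the call for each prefix length from 16 to 32 gives one box per length.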

Figure 2 plots the number of addresses per block
for CIDR block prefix lengths of 16 to 32 bits. This
plot compares the botnet density, $\mathcal{R}_{\text{bot}}$, against both
the empirical and naive density estimates of equal size
(621,861 addresses, as per Table 1). As this figure
shows, the botnet population is more tightly packed than
both empirical and naive estimates. In the case of the
empirical estimate, botnet data results in nearly twice
as many addresses per block for prefix lengths between
18 and 24 bits. The naive estimate is zero throughout
these results. Based on the results from Figure 2, we use
empirical estimation throughout the rest of this paper.

⁴ http://www.iana.org/assignments/ipv4-address-space

Figure 3 compares control data (empirically estimated
populations) against each of our four datasets:
spamming, scanning, botnet population and phishing.
In comparison to the population plot in Figure 2, these
plots represent the total number of $n$-bit blocks observed
for that population; since each population is of equal
size, the lowest line will have the highest density. For
each plot in Figure 3, the control data consists of 1000 random
subsets of $\mathcal{R}_{\text{control}}$, with the resulting distribution
plotted as a boxplot.

Figure 3(i) is a plot of the comparative volume for
$\mathcal{R}_{\text{bot}}$. As this plot shows, the population of $\mathcal{R}_{\text{bot}}$ is more
densely packed than the expected population drawn
from $\mathcal{R}_{\text{control}}$. Figure 3(ii) plots the volume of $\mathcal{R}_{\text{phish}}$
reported from May to October, 2006. We use a six-month
sample due to the smaller size of the phishing
reports in comparison to the other reports. As noted in
Table 1, the six-month phishing report is approximately
an order of magnitude smaller than the other unclean
reports. As with Figure 3(i), addresses in the phishing
report are more tightly packed than addresses selected
from the control report.

Figure 2: Density of botnets per netblock, compared against empirical and naive control sets (x-axis: block size, given as prefix length)

Figure 3(iii) plots the volume of $\mathcal{R}_{\text{spam}}$ from October
1st to 14th, 2006. Figure 3(iv) plots the volume of $\mathcal{R}_{\text{scan}}$
for the same period. Each of these reports is more tightly
packed than the comparative control reports.

As Figures 2 and 3 show, unclean reports have an $n$-bit
density greater than or equal to the
$n$-bit density of the control reports for all values of $n$.
Consequently, this data supports the spatial uncleanliness
hypothesis: compromised hosts are disproportionately
concentrated in certain networks.

5  Temporal  Uncleanliness 

We now address temporal uncleanliness: the propensity
for networks to remain unclean for extended periods
of time. In order to test for temporal uncleanliness
we compare the ability of a report of unclean addresses
to predict future compromised addresses; in particular,
whether or not a report of bot addresses can predict future
bots, spamming, scanning and phishing.

This section is divided as follows: §5.1 describes our
method for measuring the presence of temporal uncleanliness,
and §5.2 shows the results.

5.1  Model  and  methodology 

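To observe temporal uncleanliness, we examine the predictive
capacity of reports of unclean data. Consider
three reports: $\mathcal{R}_{\text{test}}$, $\mathcal{R}_{\text{control}}$ and $\mathcal{R}_{\text{result}}$. If $\mathcal{R}_{\text{test}}$ and
$\mathcal{R}_{\text{control}}$ are of equal cardinality, then $\mathcal{R}_{\text{test}}$ is a better
predictor of the report $\mathcal{R}_{\text{result}}$ at prefix length $n$ if:

$$|C_n(\mathcal{R}_{\text{test}}) \cap C_n(\mathcal{R}_{\text{result}})| > |C_n(\mathcal{R}_{\text{control}}) \cap C_n(\mathcal{R}_{\text{result}})| \qquad (4)$$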

If temporal uncleanliness exists, then we expect that
unclean reports will consistently be better predictors of
future unclean reports than a control report. However,
we note that due to spatial uncleanliness, an unclean report
will have fewer $n$-bit CIDR blocks than an equivalent
control report. As a consequence, as block size
increases, the control report will have a larger number
of imprecise successes. Therefore, there will be some
prefix length below which a control report will always
be a better predictor than the test report.
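The predictor comparison of Equation 4, and the effect of short prefixes described above, can be sketched as follows. The helper names (`blocks`, `hits`, `better_predictor`) and all addresses are hypothetical and chosen only to make the effect visible on a toy example.

```python
import ipaddress


def blocks(report, n):
    """C_n(S): the set of n-bit CIDR blocks occupied by a report."""
    return {ipaddress.ip_network(f"{a}/{n}", strict=False) for a in report}


def hits(predictor, result, n):
    """|C_n(predictor) ∩ C_n(result)|: blocks shared by the two reports."""
    return len(blocks(predictor, n) & blocks(result, n))


def better_predictor(test, control, result, n):
    """Equation 4: the test report shares strictly more n-bit blocks with the
    result report than an equally sized control report does."""
    return hits(test, result, n) > hits(control, result, n)


# Synthetic reports: the test report is clustered in one /24, while the
# control report is spread over two different /16s.
test = ["192.0.2.10", "192.0.2.200"]
control = ["198.51.100.5", "203.0.113.9"]
result = ["192.0.2.77", "198.51.7.1", "203.0.200.1"]

print(better_predictor(test, control, result, 24))
print(better_predictor(test, control, result, 16))
```

In this toy case the clustered test report wins at 24 bits but loses at 16 bits, because the spread-out control report covers more /16 blocks and so collects more imprecise matches.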
Figure 3: Comparative density of unclean netblocks against $\mathcal{R}_{\text{control}}$. Panels: (i) $\mathcal{R}_{\text{bot}}$, (ii) $\mathcal{R}_{\text{phish}}$, (iii) $\mathcal{R}_{\text{spam}}$, (iv) $\mathcal{R}_{\text{scan}}$


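For testing, we use the form of the temporal uncleanliness
hypothesis given in the equation below. Given
that $\mathcal{R}_{\text{unclean}}$ and $\mathcal{R}_{\text{control}}$ have equal cardinality, then

$$\exists n \in [16, 32] \ \text{s.t.}\ |C_n(\mathcal{R}_{\text{unclean}}) \cap C_n(\mathcal{R}_{\text{result}})| > |C_n(\mathcal{R}_{\text{control}}) \cap C_n(\mathcal{R}_{\text{result}})| \qquad (5)$$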


That  is,  there  exists  a  prefix  length  where  a  previously 
generated  report  of  unclean  activity  is  more  predictive 
of  present  unclean  activity  than  a  control  report  of  equal 
cardinality.  As  with  spatial  uncleanliness,  we  limit  our 
analyses  to  CIDR  blocks  of  at  least  16  bits. 
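
The comparison in Equations 4 and 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`cidr_blocks`, `overlap`, `better_prefix_lengths`) and the toy addresses are ours, not part of the original analysis, and reports are modeled simply as lists of IPv4 address strings.

```python
import ipaddress


def cidr_blocks(addresses, n):
    """C_n(R): the set of n-bit CIDR blocks containing the given IPv4 addresses."""
    return {
        ipaddress.ip_network(f"{addr}/{n}", strict=False)
        for addr in addresses
    }


def overlap(report, result, n):
    """|C_n(report) ∩ C_n(result)|, the quantity on each side of Equation 4."""
    return len(cidr_blocks(report, n) & cidr_blocks(result, n))


def better_prefix_lengths(unclean, control, result, lengths=range(16, 33)):
    """Prefix lengths n at which the unclean report beats the control report."""
    if len(set(unclean)) != len(set(control)):
        raise ValueError("reports must have equal cardinality")
    return [
        n for n in lengths
        if overlap(unclean, result, n) > overlap(control, result, n)
    ]


# Toy, made-up data (documentation-range addresses), not taken from the paper.
unclean = ["198.51.100.7", "198.51.100.200"]
control = ["192.0.2.1", "203.0.113.9"]
result = ["198.51.100.42"]

# Equation 5 holds for this toy data if the returned list is non-empty.
print(better_prefix_lengths(unclean, control, result))
```

Here the hypothesis in Equation 5 corresponds to `better_prefix_lengths` returning at least one prefix length in the range 16 to 32.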

5.2  Analysis 

We now test the temporal uncleanliness hypothesis formulated
in Equation 5. To do so, we compare the effective
predictiveness of $\mathcal{R}_{\text{bot-test}}$ on the unclean
reports during the period of October 1st-14th, 2006.

Figure 4 shows the relative predictive capacity of
$\mathcal{R}_{\text{bot-test}}$ against future unclean reports; for
these figures, $\mathcal{R}_{\text{phish}}$ is a sub-report of the
$\mathcal{R}_{\text{phish}}$ report from Table 1. This report is
considerably smaller than the other reports, with 2302 addresses.
This results in a smaller degree of intersection with the
randomly generated reports from the control report.

As in §4.2, we generate the reference line by plotting a
boxplot showing the variance of 1000 randomly selected test
reports. In contrast with Figure 3, the small cardinality of
$\mathcal{R}_{\text{bot-test}}$ ensures that the variations
observed by the boxplot are visible. We consider
$\mathcal{R}_{\text{bot-test}}$ to be a better predictor than
$\mathcal{R}_{\text{control}}$ if the cardinality of its
intersection with the corresponding unclean report is higher
than the intersection with randomly selected addresses in 95%
of the observed cases.
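
The 95% criterion can be sketched as a small resampling test. This is a minimal sketch under our own assumptions: control reports are drawn uniformly without replacement from a caller-supplied `population` list, addresses are represented as integers, and the names `block_ids` and `beats_control` are hypothetical. The paper's control reports are drawn from its own control data, which this toy does not reproduce.

```python
import ipaddress
import random


def block_ids(addresses, n):
    """C_n as a set of integers: each address keeps only its top n bits."""
    return {addr >> (32 - n) for addr in addresses}


def beats_control(test, result, population, n, trials=1000, level=0.95, seed=0):
    """True if `test` out-predicts at least `level` of the random control reports.

    Each control report is a random sample from `population` with the same
    cardinality as `test`; the comparison is the strict inequality of Equation 4.
    """
    rng = random.Random(seed)
    result_blocks = block_ids(result, n)
    test_score = len(block_ids(test, n) & result_blocks)
    wins = 0
    for _ in range(trials):
        control = rng.sample(population, len(test))
        if test_score > len(block_ids(control, n) & result_blocks):
            wins += 1
    return wins / trials >= level


# Toy data, not from the paper: one /16 as the control population.
base = int(ipaddress.ip_address("10.0.0.0"))
population = list(range(base, base + 2 ** 16))        # every address in 10.0.0.0/16
test = [base + 5 * 256 + h for h in (1, 2, 3, 4, 5)]  # five hosts in 10.0.5.0/24
result = [base + 5 * 256 + 77]                        # later activity in the same /24
```

With this toy population, every control sample shares the single /16 with the result, so the test report cannot win at n = 16, which mirrors the paper's observation that control reports catch up at short prefix lengths.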

As Figure 4 shows, $\mathcal{R}_{\text{bot-test}}$ is a better
predictor than $\mathcal{R}_{\text{control}}$ for botnets,
spamming and scanning at various prefix lengths. Also of note
is the impact of spatial uncleanliness: in these three figures,
$\mathcal{R}_{\text{bot-test}}$ is a better predictor for prefix
lengths of approximately 19-20 bits and longer. At shorter
prefix lengths, randomly selected addresses become better
predictors. Using the 95% threshold,
$\mathcal{R}_{\text{bot-test}}$ is a stronger predictor of
future botnet activity between 20 and 25 bits, spamming
between  19  and  32  bits,  and  scanning  between  20  and  24 
bits.  For  prefix  lengths  longer  than  these  values,  the  two 
reports  are  equally  predictive  due  to  the  low  probability 
of  seeing  CIDR  blocks  from  either  report  intersect. 

Figure 4(ii) plots the predictive capacity of
$\mathcal{R}_{\text{bot-test}}$ against $\mathcal{R}_{\text{phish}}$.
In contrast to the other plots in Figure 4, this plot indicates
that $\mathcal{R}_{\text{bot-test}}$ is not a good predictor
of  future  phishing  activity  in  comparison  to  randomly 
selected  control  sets. 

We have two hypotheses as to why this is so for phishing.
First, Ramachandran et al. [22] describe how botnet owners
place a higher premium on addresses that have not yet been
identified as bots. Because phishing sites need to be
publicized, a phishing IP address becomes public knowledge,
is marked on blacklists, and is consequently highly
unattractive for the owner of a botnet.

An  alternative  explanation  is  that,  in  contrast  to  bot¬ 
nets,  phishing  sites  are  generally  hosted  on  web  servers, 
and a phisher may prefer to host phishing sites in an
actual datacenter to ensure robustness during a flash
crowd.  At  the  minimum,  a  phishing  site  must  be  pub¬ 
licly  accessible,  while  a  bot  can  exist  behind  a  NAT  or  a 
firewall  and  still  be  useful.  Therefore,  phishers  may  pre¬ 
fer  sites  that  are  already  hosting  web  servers  and  have 
the  resources  to  handle  a  high  traffic  load. 

In order to determine whether the temporal uncleanliness
hypothesis does hold for phishing, we now consider a test that
uses phishing data exclusively. Figure 5 plots the intersection
of $\mathcal{R}_{\text{phish-test}}$ against the same phishing
set as in Figure 4(ii). In this case,
$|\mathcal{R}_{\text{phish-test}}| = 1386$. We note that this
figure shows strong evidence for temporal uncleanliness in
phishing.

Since these results show that five-month-old reports can be
used to predict the population of future reports more
effectively than randomly selected IP addresses from a week
before, we conclude that the temporal uncleanliness hypothesis
is supported by this data. Furthermore, although we chose the
range of prefix lengths in Equation 5 arbitrarily, we can now
establish a lower limit for the prefix length of 20 bits, and
an upper limit in excess of 24 bits.

We  have  also  shown  that  phishing  activity  and  botnet 
activity  are  not  related  in  the  way  that  bots,  scanning  and 
spamming are. As noted elsewhere [21, 15], scanning and
spamming are commonly implemented with botnets, so we would
expect that $\mathcal{R}_{\text{bot}}$, $\mathcal{R}_{\text{scan}}$
and $\mathcal{R}_{\text{spam}}$ are related. However, the
inability of $\mathcal{R}_{\text{bot-test}}$ to predict
future  phishing  activity  suggests  that  a  measurement  for 
uncleanliness  will  have  to  be  multidimensional:  phish¬ 
ing  sites  are  still  taken  over,  but  it  may  be  that  phishers 
have  different  criteria  for  the  machines  they  occupy  than 
botnet  owners. 

6  Blocking  Tests 

The  spatial  and  temporal  uncleanliness  hypotheses  to¬ 
gether  provide  a  method  for  identifying  compromised 
hosts.  Spatial  uncleanliness  implies  that  if  an  address 
within  a  network  is  occupied,  then  we  can  expect  other 
networks  within  the  same  netblock  to  be  occupied.  Tem¬ 
poral  uncleanliness  indicates  that  if  we  have  seen  an  ad¬ 
dress  in  the  past  used  for  an  attack,  then  we  can  expect 
addresses  from  the  same  network  to  do  so  in  the  future. 

We  now  address  the  issue  of  whether  unclean  net¬ 
works  can  be  effectively  blocked;  that  is,  whether  or  not 
blocking  a  set  of  unclean  networks  will  adversely  affect 
traffic  into  an  active  network.  To  do  so,  we  will  examine 
the  impact  of  blocking  a  set  of  unclean  networks  from 
two  weeks  of  network  traffic.  In  this  section,  we  com¬ 
pare  the  expected  false  positive  and  false  negative  values 
over  a  range  of  operating  characteristic  values.  For  this 
work,  the  operating  characteristic  is  n,  the  CIDR  block 
prefix  length. 

We begin by collecting traffic logs of all traffic that crosses
the observed network from all IP addresses
$i \in C_{24}(\mathcal{R}_{\text{bot-test}})$ for the test period
of October 1st-14th, 2006. This report,
$\mathcal{R}_{\text{candidate}}$, consists of all IP addresses
crossing the observed network which share a /24 in common with
any of the IP addresses in $\mathcal{R}_{\text{bot-test}}$. This
allows us to test the effectiveness of filtering from the /24 to
the /32 range; we pick this range because, as seen in Figure 3,
24 bits is the minimum block size at which
$\mathcal{R}_{\text{bot-test}}$ is an unambiguously better
predictor of future uncleanliness than control data. We further
constrain $\mathcal{R}_{\text{candidate}}$ to those addresses
which generate at least one TCP record during this time period.
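
The construction of the candidate report can be sketched as follows. This is a simplification under our own assumptions: flow records are reduced to hypothetical `(source_address, protocol)` pairs rather than real NetFlow records, and `slash24` and `build_candidates` are illustrative names.

```python
import ipaddress


def slash24(addr):
    """The /24 block containing an IPv4 address."""
    return ipaddress.ip_network(f"{addr}/24", strict=False)


def build_candidates(bot_test, flows):
    """Addresses seen in traffic that share a /24 with bot_test and sent TCP.

    `flows` is an iterable of (source_address, protocol) pairs; this record
    shape is an assumption made for the sketch, not the NetFlow schema.
    """
    blocks = {slash24(a) for a in bot_test}
    return {
        src for src, proto in flows
        if proto == "tcp" and slash24(src) in blocks
    }


# Toy data, not from the paper.
bot_test = ["198.51.100.7"]
flows = [
    ("198.51.100.20", "tcp"),  # same /24, TCP: kept
    ("198.51.100.21", "udp"),  # same /24, but no TCP record: dropped
    ("203.0.113.5", "tcp"),    # different /24: dropped
]
candidates = build_candidates(bot_test, flows)
```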

The source data used for this analysis is CISCO NetFlow⁵
traffic records, which are a compact summarization of traffic
information, but do not contain payload. As a consequence, our
analysis will have some degree of uncertainty as we cannot
directly validate the payload. We will therefore differentiate
addresses in two ways: by membership in one of the unclean
reports and by behavior.

5 http://www.cisco.com/go/netflow

Figure 4: Comparative predictive capacity of $\mathcal{R}_{\text{bot-test}}$ against control data (x-axis: prefix length). Panels include (iii) vs. $\mathcal{R}_{\text{spam}}$ and (iv) vs. $\mathcal{R}_{\text{scan}}$.

We partition $\mathcal{R}_{\text{candidate}}$ into three reports:
$\mathcal{R}_{\text{unknown}}$, $\mathcal{R}_{\text{hostile}}$ and
$\mathcal{R}_{\text{innocent}}$. A full inventory of the
reports used in this analysis is given in Table 2.

$\mathcal{R}_{\text{hostile}}$ consists of any IP address in
$\mathcal{R}_{\text{candidate}}$ that is also present in the
unclean reports (i.e., scanning, spamming, phishing or botnet
membership). The hostile set is identified purely by
intersecting these reports, and once an IP address is
identified as hostile it cannot be present in the remaining
two reports. $\mathcal{R}_{\text{unknown}}$ comprises the
addresses in $\mathcal{R}_{\text{candidate}}$ which are not
present in any of the unclean reports, but have no
payload-bearing flows. We define a flow as payload-bearing if
it is a TCP flow with at least 36 bytes of payload and at least
one ACK flag. Due to TCP options, a 3-packet SYN scan will
often have 36 bytes of payload, even though this data is still
part of the TCP handshake. Hand-examination of the flow logs
found multiple examples of 36-byte SYN-only scans to apparently
random ports on distributed targets.

The IP addresses in $\mathcal{R}_{\text{unknown}}$ are not proven
to be hostile but are highly suspicious. Due to the lack of
payload in flow data, we cannot definitively categorize
members of this report into either of the other two reports
and consequently we remove them from the false positive and
false negative calculations.

The population of $\mathcal{R}_{\text{innocent}}$ consequently
consists of any IP address which does conduct payload-bearing
TCP activity and is not present in any of the unclean reports.
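
The three-way partition described above can be sketched as set operations. The `Flow` record and the function names are hypothetical stand-ins for the NetFlow-derived data, and the 36-byte, one-ACK rule is taken directly from the definition of a payload-bearing flow in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """Simplified flow summary (an assumption, not the NetFlow v5 layout)."""
    src: str
    protocol: str
    payload_bytes: int
    has_ack: bool


def is_payload_bearing(flow, min_payload=36):
    """TCP flow with at least `min_payload` bytes of payload and an ACK flag."""
    return (
        flow.protocol == "tcp"
        and flow.payload_bytes >= min_payload
        and flow.has_ack
    )


def partition(candidate, unclean, flows):
    """Split the candidate set into (hostile, unknown, innocent)."""
    hostile = candidate & unclean
    with_payload = {f.src for f in flows if is_payload_bearing(f)}
    innocent = (candidate - hostile) & with_payload
    unknown = candidate - hostile - innocent
    return hostile, unknown, innocent


# Toy data, not from the paper.
candidate = {"198.51.100.1", "198.51.100.2", "198.51.100.3"}
unclean = {"198.51.100.1"}
flows = [
    Flow("198.51.100.1", "tcp", 900, True),   # listed as unclean: hostile
    Flow("198.51.100.2", "tcp", 512, True),   # payload-bearing: innocent
    Flow("198.51.100.3", "tcp", 0, False),    # no payload: unknown
]
hostile, unknown, innocent = partition(candidate, unclean, flows)
```

Membership in an unclean report takes precedence, so an address is placed in the hostile set regardless of its flow behavior.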

Our prediction scenario assumes that an organization received
$\mathcal{R}_{\text{bot-test}}$ and is blocking
$C_n(\mathcal{R}_{\text{bot-test}})$ for some value of
$n \in [24, 32]$. The success of this defensive mechanism is
based on how many hostile and innocent addresses the mechanism
blocks (as noted above, while the unknown population is
calculated and analyzed in this exercise, it is not scored).
The score for the defensive mechanism is the relative success,
measured in true and false positives, of the filter as a
function of $n$. We define a false positive as a member of
$\mathcal{R}_{\text{innocent}}$ blocked by the filter, and a
true positive as a member of $\mathcal{R}_{\text{hostile}}$
blocked by the filter.

Figure 5: Comparative predictive capacity of phishing reports

Reports used for prediction testing

Tag        Type      Class    Valid Dates            Size       Reporting method
unclean    Provided  Special  2006/10/01-2006/10/14  1,158,103  The union of the four unclean reports; note that there is overlap
candidate  Observed  N/A      2006/10/01-2006/10/14  1030       IP addresses crossing the network border and which are in the same /24s as $\mathcal{R}_{\text{unclean}}$
hostile    Observed  N/A      2006/10/01-2006/10/14  287        Members of $\mathcal{R}_{\text{candidate}}$ also present in $\mathcal{R}_{\text{unclean}}$
unknown    Observed  N/A      2006/10/01-2006/10/14  708        Members of $\mathcal{R}_{\text{candidate}}$ not in $\mathcal{R}_{\text{unclean}}$, but engaged in suspicious activity
innocent   Observed  N/A      2006/10/01-2006/10/14  35         Members of $\mathcal{R}_{\text{candidate}}$ not present in $\mathcal{R}_{\text{hostile}}$ or $\mathcal{R}_{\text{unknown}}$

Table 2: Table of reports used for prediction test

To calculate the true and false positive rates, we define
a membership function, $m$:

$$m(i, S) = \begin{cases} 1 & C_{32}(i) \subseteq C_{32}(S) \\ 0 & \text{otherwise} \end{cases} \tag{6}$$

For any prefix length $n$, we calculate the population as a
function of $n$ by summing the unique IP addresses that appear
within $C_n(\mathcal{R}_{\text{bot-test}})$:

$$\mathrm{pop}(n) = \sum_{i \in C_n(\mathcal{R}_{\text{bot-test}})} m(i, \mathcal{R}_{\text{candidate}} \cap (\mathcal{R}_{\text{innocent}} \cup \mathcal{R}_{\text{hostile}})) \tag{7}$$

As noted above, this calculation explicitly avoids the use of
$\mathcal{R}_{\text{unknown}}$. We calculate the true positive
and false positive values by calculating a similar value over
the various reports:

$$\mathrm{TP}(n) = \sum_{i \in C_n(\mathcal{R}_{\text{bot-test}})} m(i, \mathcal{R}_{\text{candidate}} \cap \mathcal{R}_{\text{hostile}}) \tag{8}$$

$$\mathrm{FP}(n) = \sum_{i \in C_n(\mathcal{R}_{\text{bot-test}})} m(i, \mathcal{R}_{\text{candidate}} \cap \mathcal{R}_{\text{innocent}}) \tag{9}$$
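
Equations 7 to 9 amount to counting how many hostile and innocent addresses fall inside the blocked $n$-bit blocks. The sketch below is one way to compute them, assuming the hostile and innocent sets have already been restricted to the candidate report; `score` is a hypothetical helper name, and the addresses are made up.

```python
import ipaddress


def score(bot_test, hostile, innocent, n):
    """Return (TP(n), FP(n), pop(n)) for blocking the n-bit blocks of bot_test."""
    blocks = {
        ipaddress.ip_network(f"{a}/{n}", strict=False) for a in bot_test
    }

    def is_blocked(addr):
        # An address is blocked if its own n-bit block is one of the blocked blocks.
        return ipaddress.ip_network(f"{addr}/{n}", strict=False) in blocks

    tp = sum(1 for a in hostile if is_blocked(a))
    fp = sum(1 for a in innocent if is_blocked(a))
    # hostile and innocent are disjoint, so the scored population is their sum.
    return tp, fp, tp + fp


# Toy data, not from the paper.
bot_test = ["198.51.100.7"]
hostile = {"198.51.100.9", "198.51.100.130"}
innocent = {"198.51.100.200", "203.0.113.5"}
```

Because the unknown addresses are excluded, pop(n) is simply TP(n) + FP(n), which matches the rows of Table 3 (for example, 287 + 35 = 322 at n = 24).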

Table 3 summarizes the effectiveness of this prediction
method. As this table shows, all three populations increase as
the block size increases (that is, as the prefix length $n$
decreases). At $n = 24$, 90% of the incoming addresses are
correctly identified as hostile. If we assume that unknown
addresses are hostile, the true positive rate is 97%.
Furthermore, the false positive rate remains relatively low
until $n = 26$.


n   TP(n)  FP(n)  pop(n)  $\mathcal{R}_{\text{unknown}}$
24  287    35     322     708
25  172    22     194     344
26  81     1      82      200
27  38     1      39      105
28  18     0      18      60
29  7      0      7       29
30  1      0      1       14
31  1      0      1       7
32  1      0      1       0

Table 3: Observed true and false positive counts
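
The percentages quoted for n = 24 can be recomputed from the counts in Table 3. The sketch below is one reading of those figures, assuming that "correctly identified as hostile" means TP(n) / pop(n) and that the second figure adds the unknown addresses to both numerator and denominator; the dictionary keys are our own labels.

```python
# Rows of Table 3: n -> (TP, FP, pop, unknown)
TABLE_3 = {
    24: (287, 35, 322, 708),
    25: (172, 22, 194, 344),
    26: (81, 1, 82, 200),
    27: (38, 1, 39, 105),
    28: (18, 0, 18, 60),
    29: (7, 0, 7, 29),
    30: (1, 0, 1, 14),
    31: (1, 0, 1, 7),
    32: (1, 0, 1, 0),
}


def rates(n):
    """Shares of blocked addresses that are hostile or innocent at prefix length n."""
    tp, fp, pop, unknown = TABLE_3[n]
    return {
        "hostile_share": tp / pop,
        "innocent_share": fp / pop,
        # Treat every unknown address as hostile, as the text suggests.
        "hostile_share_with_unknown": (tp + unknown) / (pop + unknown),
    }


r24 = rates(24)
```

At n = 24 this gives roughly 0.89 and 0.97, in line with the rounded 90% and 97% quoted in the text.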


Of note with this dataset is the volume of uncertain
addresses (i.e., the population of
$\mathcal{R}_{\text{unknown}}$). At a 24-bit prefix length,
$|C_{24}(\mathcal{R}_{\text{bot-test}}) \cap C_{24}(\mathcal{R}_{\text{unknown}})|$
yields approximately 700 addresses. We first note that
unknown addresses have engaged in TCP communications, but have
not exchanged payload; consequently, blocking these addresses
does not impact traffic.

Of more concern is that all of the addresses in
$\mathcal{R}_{\text{unknown}}$ engage in some form of suspicious
behavior (that is, suspicious apart from trying to connect with
the network and not exchanging payload). Hand examination
found many addresses trying to open communications from
ephemeral ports to ephemeral ports (notably port TCP/51736)
and slow scanning (the scan detection mechanism is calibrated
to identify scans that take place over an hour, whereas scans
observed in this dataset would often contact fewer than 30
addresses a day over the course of the test period).

The strength of this blocking method is predicated on the
relatively sparse amount of traffic issuing from these
netblocks. As Table 3 shows, 1030 IP addresses were blocked
when $n$ was set to 24 bits. Since
$|C_{24}(\mathcal{R}_{\text{bot-test}})| = 173$, there is a
potential set of 173 × 256 = 44,288 addresses that can be
blocked. Consequently, only about 2% of the total IP addresses
available in those /24s communicated with the observed network
during this time.
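
The sparsity figure follows from simple arithmetic on the numbers in the text, shown here as a short check (the variable names are ours):

```python
blocks = 173                       # |C_24(R_bot-test)|, from the text
addresses_per_block = 2 ** (32 - 24)
blockable = blocks * addresses_per_block

observed = 1030                    # addresses blocked at n = 24 (hostile + innocent + unknown)
share = observed / blockable

print(blockable, f"{share:.1%}")
```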

Some of the effectiveness of this method may be attributed to
the demographics of the botnet and the network.
$\mathcal{R}_{\text{bot-test}}$ consists primarily of addresses
outside the English-speaking world, with 70% of the addresses
coming from Turkey. In addition, the network under observation
is primarily an edge network; that is, all traffic at its
border is either originating from an address within that
border or going to an IP address within that border.
Therefore, while we have shown that a five-month-old botnet
report can still be used to effectively predict and halt
hostile traffic, issues of demographics and a network's
target audience must also be evaluated.


7  Conclusion 

In  this  paper,  we  have  demonstrated  that  it  is  possible  to 
effectively  predict  future  hostile  activity  from  past  net¬ 
work  activity.  To  do  so,  we  have  defined  a  network- 
based  quality  of  uncleanliness,  which  is  an  indicator  of 
how  likely  a  network  is  to  contain  compromised  hosts. 

As  an  initial  work  in  this  field,  we  have  focused  on 
testing  basic  hypotheses  about  uncleanliness,  which  we 
have  defined  with  the  spatial  and  temporal  uncleanliness 
hypotheses. Using reports of network activity and traffic
logs of a large network, we have shown evidence of
spatial  and  temporal  uncleanliness.  We  have  also  shown 
that  an  uncleanliness  measure  may  involve  multiple  di¬ 
mensions,  such  as  botnets  and  phishing. 

Finally,  we  have  demonstrated  that  spatial  and  tem¬ 
poral  uncleanliness,  coupled  with  the  limited  audience 
of  an  edge  network,  can  be  effectively  used  to  block 
hostile traffic in the future. Given the demographics issues
noted in §6, uncleanliness may best be used as a risk
indicator: by showing that a network is engaging in
unclean behavior, security personnel can evaluate
whether  the  risk  of  hostile  activity  from  the  network  is 
worth  the  benefit  of  receiving  commerce  and  communi¬ 
cation  from  that  network  under  normal  circumstances. 

Our  immediate  goal  following  this  work  is  to  develop 
a  more  rigorous  metric  for  uncleanliness,  in  particu¬ 
lar  a  multidimensional  uncleanliness  metric  to  measure 
the  aggregate  probability  that  an  address  is  occupied. 
The elements of this metric involve the components discussed
in this work as well as other predictive indicators of
vulnerability (such as communication with botnet C&C nodes).

We also believe that spatial uncleanliness, in particular,
has useful implications for network log analysis. If we know
that a host from one network is attacking, scanning or
otherwise interfering with the traffic on an observed network,
it is reasonable to examine other traffic from that network to
see if there is coordinated hostile activity.